3 research outputs found

    A lexical approach for taxonomy mapping

    Obtaining a useful, complete overview of Web-based product information has become difficult due to the ever-growing amount of information available in online shops. Findings from previous studies suggest that better search capabilities, such as the exploitation of annotated data, are needed to keep online shopping transparent for the user. Annotations can, for example, help present information from multiple sources in a uniform manner. To support the product data integration process, we propose an algorithm that can autonomously map heterogeneous product taxonomies from different online shops. The proposed approach uses word sense disambiguation techniques, approximate lexical matching, and a mechanism that deals with composite categories. Our algorithm's performance compared favorably against two other state-of-the-art taxonomy mapping algorithms on three real-life datasets. The results show that the F1-measure for our algorithm is on average 60% higher than that of a state-of-the-art product taxonomy mapping algorithm.

    Ontology population from web product information

    With the vast amount of information available on the Web, there is an increasing need to structure Web data in order to make it accessible to both users and machines. E-commerce is one of the areas in which growing data congestion on the Web has serious consequences. This paper proposes a framework that is capable of populating a product ontology using tabular product information from Web shops. By formalizing product information in this way, better product comparison or recommendation applications could be built. Our approach employs both lexical and syntactic matching for mapping properties and instantiating values. The performed evaluation shows that instantiating consumer electronics from Best Buy and Newegg.com results in an F1 score of approximately 77%.

    SCHEMA – An Algorithm for Automated Product Taxonomy Mapping in E-commerce

    This paper proposes SCHEMA, an algorithm for automated mapping between heterogeneous product taxonomies in the e-commerce domain. SCHEMA utilises word sense disambiguation techniques, based on the ideas from the algorithm proposed by Lesk, in combination with the semantic lexicon WordNet. For finding candidate map categories and determining the path similarity, we propose a node matching function that is based on the Levenshtein distance. The final mapping quality score is calculated using the Damerau-Levenshtein distance and a node-dissimilarity penalty. The performance of SCHEMA was tested on three real-life datasets and compared with PROMPT and the algorithm proposed by Park & Kim. It is shown that SCHEMA improves considerably on both recall and F1-score, while maintaining similar precision.
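
    As a rough illustration of the Levenshtein distance named in the abstract above, the sketch below computes the edit distance between two category names and turns it into a similarity in [0, 1]. It is not SCHEMA's actual node matching function, which the abstract does not specify; the function names, the lower-casing step, and the normalisation by the longer string are assumptions made for this example only.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic Levenshtein distance (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Hypothetical helper: 1.0 for identical names, 0.0 for fully different.

    Lower-casing and dividing by the longer length are illustrative
    assumptions, not choices taken from the SCHEMA paper.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / longest


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(normalized_similarity("Televisions", "Television"))
```

    The Damerau-Levenshtein variant mentioned for the final quality score additionally counts a transposition of two adjacent characters as a single edit; it is not implemented in this sketch.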